ABSTRACT

The effect of process variation (PV) on delay is a
major cause of performance degradation in advanced
technologies. The performance of different routing
algorithms is determined with and without PV for
different traffic patterns. The saturation throughput
and average message delay are used as performance
metrics to evaluate the routing algorithms. PV
decreases the saturation throughput and increases the
average message delay. Adaptive routing algorithms
should therefore take PV into account. A novel PV
delay and congestion aware routing (PDCR) algorithm
is presented for asynchronous network-on-chip (NOC)
design. The routing algorithm outperforms various
adaptive routing algorithms in average delay and
saturation throughput for different traffic patterns. A
low-power content-addressable memory (CAM) based on
a new algorithm is proposed for associativity between
the input tag and the corresponding address of the
output data. The proposed architecture is based on a
recently developed sparse clustered network that
uses binary connections and, on average, eliminates
most of the parallel comparisons performed during a
search.

Keywords: Asynchronous design, congestion, network 
on chip (NoC), process variation (PV), routing 
algorithms 

I. INTRODUCTION 

The International Technology Roadmap for
Semiconductors identifies process variation (PV)
parameters as a critical challenge for IC
manufacturing. Systematic and random variations are
the two sources of PV. With technology scaling,
random variation becomes significantly larger than
systematic variation. Random variation appears in
logic gates and interconnects. The impact of random
PV emerges at both low and high levels of design.

One of the key factors in designing a network on chip
(NOC) is the routing algorithm. An efficient routing
algorithm is required to achieve high performance.
Hence, ignoring the impact of PV during the design of
any routing algorithm results in unexpected average
message delay and saturation throughput, which are
the two metrics used to evaluate the performance of a
routing algorithm. The saturation throughput occurs
when no additional messages can be injected
successfully into the network. Prior to the saturation
throughput, the average message delay increases
slightly with the injection load. However, the
average message delay increases exponentially beyond
network saturation. As a hardware solution, a router
design has been proposed to mitigate the impact of PV.
A variation-adaptive variable-cycle router configures
its cycle latency adaptively according to the spatial
PV to increase the network frequency in the
asynchronous network. An adaptive routing algorithm
for multi-core NOC architectures has been presented
to reduce the saturation bandwidth degradation caused
by PV. A source routing algorithm has been introduced
to enhance the speed of communication in an NOC
based on PV.

To the best of our knowledge, the work presented in 
this paper is the first work to investigate the impact of 
PV on different routing algorithms. Moreover, an 
adaptive routing algorithm that is aware of the PV and
congestion for asynchronous NOC designs is 
introduced in this paper. 


In this paper, a novel adaptive routing algorithm is
proposed for asynchronous NOC designs to reduce
the effect of PV. The presented algorithm is
applicable to any source of PV, since the technique is
insensitive to the source of the variation. The novel
routing algorithm uses the PV and congestion
information as metrics to select a suitable output
port (OP). In addition, the realistic values of average
message delay and saturation throughput under high
PV for different routing algorithms are compared with
the nominal values (without PV).


II. EXISTING SYSTEM

The existing algorithm can be divided into two
procedures: 1) determining the target node (TN) and
2) the selection criterion for the OP. The details of
each procedure are described in the following
subsections.


1. Determining the Target Node (TN): At the source
router, a random intermediate (IM) router is chosen
between the source and the destination as an IM
station during the message trip. The message
therefore has two phases (ph-0, ph-1) when it is
routed from the source to the destination. In ph-0,
the message is routed from the source to the IM node.
Ph-1 is used when the message is forwarded from the
IM router to the destination router. In PDCR, a
uniform random distribution function is used to select
a random IM router between the source and
destination. In addition, phase (PH) and IM fields are
added to each message to retain the values of the
message phase and the IM router identification (Id).
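
The IM selection described above can be sketched as
follows. This is a minimal illustration, not the
authors' implementation. It assumes a 2-D mesh with
(x, y) router coordinates and assumes that "between the
source and the destination" means the bounding
rectangle spanned by the two routers. The names
Message, pick_intermediate and make_message are
hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[int, int]


@dataclass
class Message:
    src: Coord                  # (x, y) of the source router
    dst: Coord                  # (x, y) of the destination router
    ph: int = 0                 # PH field: 0 = toward IM, 1 = toward destination
    im: Optional[Coord] = None  # IM field: Id (coordinates) of the IM router


def pick_intermediate(src: Coord, dst: Coord, rng=random) -> Coord:
    """Uniformly pick a router in the rectangle spanned by src and dst.

    The rectangle interpretation is an assumption of this sketch.
    """
    x = rng.randint(min(src[0], dst[0]), max(src[0], dst[0]))
    y = rng.randint(min(src[1], dst[1]), max(src[1], dst[1]))
    return (x, y)


def make_message(src: Coord, dst: Coord, rng=random) -> Message:
    """Create a message at the source router with ph = 0 and a random IM."""
    return Message(src=src, dst=dst, ph=0, im=pick_intermediate(src, dst, rng))
```

Passing a seeded random.Random instance as rng makes
the choice reproducible in simulation.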



Fig 1: Exploiting the same path for more than one time


Each router needs to determine whether the TN is the
IM router or the destination router. When a router
forwards the message to the TN, it applies the XY and
YX routing algorithms to calculate the OP direction
(i.e., N=0). The integer value of the output direction
is denoted by Pxy when the XY routing algorithm is
used. Pyx denotes the integer value of the output
direction when the YX routing algorithm is used to
route the message to the TN. The default value of the
ph field of the message is zero. However, the ph field
of the message is changed from ph-0 to ph-1 in one of
the following cases:

1) If the current router is the IM router; 

2) If the current router exists in the same row of the 
destination router (rx==dx); 

3) If the current router exists in the same column of 
the destination router (ry==dy); 

Here, the coordinates of the current router are rx for
the X coordinate and ry for the Y coordinate. In
addition, dx is the X coordinate of the destination
node and dy is the Y coordinate of the destination
node. If any one of the three conditions is true, this
is sufficient to make the ph field equal to one, and
hence the TN is assigned the destination router ID.
On the other hand, when none of the three conditions
holds, the ph field equals zero and hence the TN is
assigned the IM field of the message.
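
The phase update, TN selection and XY/YX port
calculation can be sketched as below. This is an
illustrative sketch under stated assumptions: the text
only gives N=0, so the remaining direction codes (E, S,
W, LOCAL) are hypothetical, as is the convention that X
grows toward east and Y toward north. The sketch also
assumes that ph stays at one once it has been set.

```python
# Direction codes: only N = 0 appears in the text; the rest are assumed.
N, E, S, W, LOCAL = 0, 1, 2, 3, 4


def update_phase(ph, cur, im, dst):
    """Return the ph field after the message reaches router `cur`."""
    rx, ry = cur
    dx, dy = dst
    # Any one of the three conditions switches the message to ph-1.
    if ph == 1 or cur == im or rx == dx or ry == dy:
        return 1
    return 0


def target_node(ph, im, dst):
    """TN is the destination in ph-1 and the IM router in ph-0."""
    return dst if ph == 1 else im


def xy_port(cur, tn):
    """XY routing: correct the X offset first, then the Y offset."""
    (rx, ry), (tx, ty) = cur, tn
    if tx != rx:
        return E if tx > rx else W
    if ty != ry:
        return N if ty > ry else S
    return LOCAL


def yx_port(cur, tn):
    """YX routing: correct the Y offset first, then the X offset."""
    (rx, ry), (tx, ty) = cur, tn
    if ty != ry:
        return N if ty > ry else S
    if tx != rx:
        return E if tx > rx else W
    return LOCAL
```

For a router at (1, 1) and a TN at (2, 3), xy_port
returns E (Pxy) while yx_port returns N (Pyx), so the
two candidates differ and a selection step is needed.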

2. Selection Criterion for the Output Port (OP): 

After applying the XY and YX routing algorithms, PDCR
chooses between the two output directions (Pxy, Pyx)
based on the congestion and the delay with PV (DPV).
At each router, the congestion (Cxy) of the neighbor
router and the DPV between the current router and
that neighbor (if the XY routing algorithm is used)
are compared with the congestion (Cyx) of the
neighbor router and the DPV (yxPV) between the
current router and that neighbor (if the YX routing
algorithm is used). Comparing the two ports using
these six parameters leads to three main scenarios
that should be handled.
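
One possible way to combine the six parameters is
sketched below. The three scenarios are not enumerated
in this section, so the policy shown is an assumption
made for illustration only: when both algorithms give
the same direction that port is used, otherwise the port
with the lower weighted sum of congestion and DPV is
chosen. The function name and the weights w_cong and
w_delay are hypothetical.

```python
def select_output_port(pxy, pyx, cxy, cyx, dpv_xy, dpv_yx,
                       w_cong=1.0, w_delay=1.0):
    """Choose between the XY and YX candidate ports.

    Hypothetical policy, not the exact PDCR rule:
    - same direction from both algorithms: nothing to choose;
    - otherwise: pick the port with the lower combined cost,
      preferring the XY port on a tie.
    """
    if pxy == pyx:
        return pxy
    cost_xy = w_cong * cxy + w_delay * dpv_xy
    cost_yx = w_cong * cyx + w_delay * dpv_yx
    return pxy if cost_xy <= cost_yx else pyx
```

The weights let a designer bias the choice toward
congestion or toward PV delay. How the two quantities
are scaled against each other is left open here.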



Fig 2: Example of the same OP direction

III. PROPOSED SYSTEM

In a conventional CAM array, each entry consists of a 
tag that, if matched with the input, points to the 
location of a data word in a static random access 
memory (SRAM) block. The actual data of interest 
are stored in the SRAM and a tag is simply a 
reference to it. Therefore, when it is required to search 
for the data in the SRAM, it suffices to search for its 
corresponding tag. Consequently, the tag may be 
shorter than the SRAM-data and would require fewer 
bit comparisons. 

An example of a typical CAM array, consisting of
four entries of 4 bits each, is shown in Fig. 3. A
search data register is used to store the input bits. The
register applies the search data to the differential
search lines (SLs), which are shared among the
entries. Then, the search data are compared against all
of the CAM entries. Each CAM word is attached to a
match line (ML) common to its constituent bits, which
indicates whether or not they match the input bits.
Since the MLs are highly capacitive, a sense amplifier
is typically used on each ML to increase the
performance of the search operation.
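
The search operation of such an array can be modeled
behaviorally as below. This is a software sketch only:
the hardware compares all entries in parallel, whereas
the loop here visits them one by one. The stored words
and the helper matched_address (a simple priority
encoder) are assumptions added for illustration.

```python
def cam_search(entries, search_word, width=4):
    """Return one match-line value per entry (True = all bits match)."""
    match_lines = []
    for entry in entries:
        # Bitwise compare, one CAM cell per bit position.
        bits_equal = all(
            ((entry >> b) & 1) == ((search_word >> b) & 1)
            for b in range(width)
        )
        match_lines.append(bits_equal)
    return match_lines


def matched_address(match_lines):
    """Return the lowest matching entry index, or None on a miss."""
    for addr, ml in enumerate(match_lines):
        if ml:
            return addr
    return None


# Example 4x4 array with arbitrary contents.
cam = [0b1010, 0b0110, 0b1111, 0b0001]
ml = cam_search(cam, 0b1111)
```

Here ml has a single asserted match line, at entry 2,
which is the address used to read the SRAM.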



Fig. 3. 4x4 CAM array 


As an example, in translation look-aside buffers
(TLBs), the tag is the virtual page number (VPN), and
the data are the corresponding physical page number
(PPN). A virtual address generated by the CPU
consists of the VPN and a page offset. The page offset
is later used along with the PPN to form the physical
address. Since most TLBs are fully associative, a
fully parallel search among the VPNs is conducted for
every generated virtual address in order to find the
corresponding PPN.
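
The address translation just described can be written
out as a short sketch. The 12-bit page offset (4 KiB
pages) and the TLB contents are assumptions chosen for
the example, and the sequential loop stands in for the
fully parallel search performed by the hardware.

```python
PAGE_OFFSET_BITS = 12  # assumed page size of 4 KiB


def translate(tlb, virtual_address):
    """Translate a virtual address using a list of (VPN, PPN) pairs.

    Returns the physical address, or None on a TLB miss.
    """
    vpn = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    for tag, ppn in tlb:
        if tag == vpn:  # tag match: the PPN is the stored data
            return (ppn << PAGE_OFFSET_BITS) | offset
    return None


# Hypothetical TLB contents.
tlb = [(0x12345, 0x00ABC), (0x00001, 0x00777)]
physical = translate(tlb, 0x12345678)
```

With these contents, virtual address 0x12345678 has
VPN 0x12345 and offset 0x678, so it maps to physical
address 0xABC678.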

A binary CAM (BCAM) cell is typically the integration
of a 6-transistor (6T) SRAM cell and comparator
circuitry. The comparator circuitry is made out of
either an XNOR or an XOR structure, leading to a
NAND-type or a NOR-type operation, respectively. The
selection of the comparing structure depends on the
performance and the power requirements, as a
NAND-type operation is slower and consumes less
energy than a NOR-type operation.

In a NAND-type CAM, the MLs are precharged high
during the precharge phase. During the evaluation
phase, in the case of a match, the corresponding ML is
pulled down through a series of transistors performing
a logical NAND in the comparison process. In a
NOR-type CAM, the MLs are also precharged high during
the precharge phase. However, during the evaluation
phase, all of the MLs are pulled down unless there is a
matched entry such that the pull-down paths M3-M4
and M5-M6 are disabled. Therefore, a NOR-type
CAM has a higher switching activity than a NAND-type
CAM, since there are typically more mismatched
entries than matched ones. Although a NAND-type CAM
has the advantage of lower energy consumption
compared with its NOR-type counterpart, it has two
drawbacks: 1) a quadratic delay dependence on the
number of cells due to the serial pull-down path and
2) a low noise margin.
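
The difference in switching activity can be illustrated
with a simple count of discharged MLs per search. This
is a behavioral sketch based only on the description
above (a NAND-type ML discharges on a match, a NOR-type
ML discharges on a mismatch). It does not model energy
per transition, and the function name and the stored
words are hypothetical.

```python
def ml_discharges(entries, search_word, cam_type):
    """Count match lines pulled down during one evaluation phase."""
    matches = sum(1 for entry in entries if entry == search_word)
    mismatches = len(entries) - matches
    if cam_type == "NAND":
        return matches      # only matching MLs are pulled down
    if cam_type == "NOR":
        return mismatches   # every mismatching ML is pulled down
    raise ValueError("cam_type must be 'NAND' or 'NOR'")


# Example with arbitrary stored words.
entries = [0b1010, 0b0110, 0b1111, 0b0001]
nand_count = ml_discharges(entries, 0b1111, "NAND")
nor_count = ml_discharges(entries, 0b1111, "NOR")
```

For this four-entry example with one matching entry,
the NAND-type array discharges one ML and the NOR-type
array discharges three.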

IV. RESULTS 

Fig 5: Technology Schematic

Fig 6: Output Waveform

V. CONCLUSION 

Delay variation in logic gates and interconnects is
produced as a result of PV, which impacts NOC
design. The delay variation is a major cause of
performance degradation in routing algorithms.
PV decreases the saturation throughput and
increases the average message delay relative to
nominal. This paper presents the first study of the
influence of PV on different routing algorithms. In
this paper, the algorithm and the architecture of a
low-power CAM are also introduced. The proposed
architecture employs a novel associativity mechanism
based on a recently developed family of associative
memories. The CAM is suitable for low-power
applications where frequent and parallel look-up
operations are required. Depending on the
application, non-uniform inputs may result in higher
power consumption but do not affect the accuracy of
the final result. In other words, a few false positives
may be generated by the sparse clustered network
(SCN)-based classifier, which are then filtered by the
enabled CAM sub-blocks. Therefore, no false
negatives are ever generated.